-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding tests for Liveness probes #12497
Conversation
af91fcc
to
7507034
Compare
7507034
to
bbf1850
Compare
Codecov Report
@@ Coverage Diff @@
## main #12497 +/- ##
==========================================
+ Coverage 87.46% 87.48% +0.01%
==========================================
Files 195 195
Lines 9671 9718 +47
==========================================
+ Hits 8459 8502 +43
- Misses 928 931 +3
- Partials 284 285 +1
Continue to review full report at Codecov.
|
cc @julz @nader-ziada |
looks good to me, @psschwei can you take a look as well |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
||
// If true sleeping till the first kubelet probe check. | ||
if tc.sleep { | ||
time.Sleep(15 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is trying to surface the bug we hit in #12462 by making sure the pod stays up for at least 15 seconds even if a liveness probe is configured, but I wouldn't have guessed that without remembering about that issue. I wonder if there's a way to do this that (a) is a bit clearer about what it's testing -- i.e. that a pod with a liveness probe is accessible while that probe returns true (and the probe actually runs), and then is restarted if it fails (b) ideally avoids a hardcoded sleep.
What about if the test image counted how many times its liveness handler had been invoked and returned that number in the response? Then we could WaitForEndpointState
until the liveness probe had been executed at least N times (avoiding the sleep). We could then - maybe as a follow up - have the test POST to an endpoint on the test image that would cause the liveness check to fail, and assert that the container is properly restarted (this is similar to how we test readiness probes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense - will do these changes, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@julz just for clarity, if the liveness probe fails and the kubelet does see it, even though it restarts the containers, the readiness probe fails, and so the liveness probe check will also hang forever because of that. I am not really sure why the readiness probe is failing though - checking on that. (this is without any user provided readiness probe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might be the case where I am just adding bad livenesProbe or readinessProbe and going into the probe pitfalls, but I have tried to capture more of it here: #12571
/hold |
New changes are detected. LGTM label has been removed. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Shashankft9 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
@Shashankft9 you still have a hold on the PR - is it ready for another review? |
@dprotaso this is blocked by couple of issues: |
@Shashankft9: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This Pull Request is stale because it has been open for 90 days with |
Fixes #12480
Proposed Changes
test_images
to accommodate for liveness handler